1.  INTRODUCTION


There are several recently proposed classes of empirical probability
density functions [1,4,5,7], all generally considered to be superior
to the classical histogram estimates. The class considered in this
paper is based on independent observations, i.e. $X_1, X_2, \ldots, X_n$
are independent and identically distributed random variables with
continuous unknown density function $f(x)$. The method used to
estimate $f(x)$ is that proposed by Rosenblatt; denoting the estimate
by $f_n(x)$, we define

$$ f_n(x) = \frac{1}{n\,b(n)} \sum_{j=1}^{n} W\!\left[\frac{x - X_j}{b(n)}\right], $$

where $W(u)$ is a bounded non-negative integrable weight function with

$$ \int W(u)\,du = 1, $$

and $b(n)$ is a positive bandwidth function which tends to zero as
$n \to \infty$, but is such that $1/n = o[b(n)]$, i.e.
$n\,b(n) \to \infty$. Thus we might have $b(n) \propto n^{-1/2}$, for
example.


We note that all estimates of this form are themselves density
functions for a given set of observations; that is,

$$ f_n(x) \ge 0, \qquad \int f_n(x)\,dx = 1. $$

Since the $X_j$'s are random variables, $f_n(x)$ is a continuous
parameter stochastic process, but it is clearly non-stationary.


The estimate $f_n(x)$ can be shown to be locally biased for any value
of $x$ under relatively mild conditions [4]. Our object in this paper
is to investigate a global measure of how good $f_n(x)$ is as an
estimate of $f(x)$. The measure was originally proposed by Bickel and
Rosenblatt [2] and is given by

$$ \beta(n) = \int [f_n(x) - f(x)]^2\,dx. $$

Since the value of $\beta(n)$ will vary with each realization of
$X_1, \ldots, X_n$, it is a statistic or function of the $n$ random
variables. A possible application for such a statistic would be in
goodness-of-fit type tests, in an analogous manner to the more
familiar Kolmogorov-Smirnov test.


Bickel and Rosenblatt [2] have established that if
$b(n) = o[n^{-2/9}]$ as $n \to \infty$ and if $a(x)$ is a bounded,
piecewise smooth integrable function then

$$ b(n)^{-1/2} \left[ n\,b(n) \int [f_n(x) - f(x)]^2\,a(x)\,dx - \int f(x)\,a(x)\,dx \int W(z)^2\,dz \right] $$

is asymptotically normally distributed with zero mean and variance

$$ 2\,W^{(4)}(0) \int a(x)^2 f(x)^2\,dx, $$

as $n \to \infty$, where $W^{(4)}(0)$ is the fourth convolution of $W$
with itself, evaluated at zero. Thus, $\beta(n)$ has an asymptotically
normal distribution, regardless of the underlying density $f(x)$.

A problem in this situation is that, unlike the Kolmogorov-Smirnov
test statistic, the statistic $\beta(n)$ is not distribution-free.
Further, its exact distribution for any finite value of $n$ does not
seem to be mathematically tractable. We thus examined some
representative cases through simulation, hoping that $\beta(n)$ would
be fairly robust with rapid convergence to the asymptotic
distribution. It was also hoped that the simulations would cast light
on these conjectures and perhaps suggest some unexpected results.


2.  SIMULATION 


The primary object of the simulation was to investigate the
distribution of the statistic $\beta(n)$:

$$ \beta(n) = \int [f_n(x) - f(x)]^2\,dx, $$

over a suitable range of integration. We performed simulations with
synthetic sampling from both uniform and Cauchy distributions; the
triangular weight function

$$ W(u) = \begin{cases} 1 - |u|, & \text{if } |u| < 1 \\ 0, & \text{otherwise} \end{cases} $$

was used to evaluate $f_n(x)$ in both cases. We found little
difference as far as $\beta(n)$ was concerned between the triangular
and other "smoother" (e.g., quadratic) weight functions for our
samples of from 100 to 1500 deviates.
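
As an illustration only, the estimator with the triangular weight
function can be sketched in a few lines of Python. The function names
and the four observations are invented for the example and are not
taken from the original computer programs.

```python
def triangular_weight(u):
    """Triangular weight W(u) = 1 - |u| for |u| < 1, and 0 otherwise."""
    a = abs(u)
    return 1.0 - a if a < 1.0 else 0.0


def density_estimate(x, sample, b):
    """Rosenblatt estimate f_n(x) = (1 / (n b)) * sum_j W((x - X_j) / b)."""
    n = len(sample)
    return sum(triangular_weight((x - xj) / b) for xj in sample) / (n * b)


# Made-up observations, used only to exercise the functions.
sample = [0.2, 0.5, 0.55, 0.9]
print(density_estimate(0.5, sample, b=0.1))
```

Only the observations within one bandwidth of x contribute to the sum,
which is why a small bandwidth gives a rapidly fluctuating estimate.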


A.  UNIFORM  RANDOM  VARIABLES 

In the case of uniform (0,1) random variables, we have

$$ f(x) = \begin{cases} 1, & 0 < x < 1 \\ 0, & \text{otherwise.} \end{cases} $$

Thus, $\beta(n)$ becomes,

$$ \beta(n) = \int_{b(n)}^{1-b(n)} [f_n(x) - 1]^2\,dx. \qquad (2.1) $$

The limits of integration are from $b(n)$ to $1 - b(n)$ instead of
from 0 to 1 to avoid the marked bias of $f_n(x)$ near 0 and 1. As long
as $b(n) < x < 1 - b(n)$, though, $f_n(x)$ is unbiased:


$$ E[f_n(x)] = \frac{1}{b(n)} \int_0^1 W\!\left[\frac{x-y}{b(n)}\right] dy = \frac{1}{b(n)} \int_{x-b(n)}^{x+b(n)} \left[ 1 - \frac{|x-y|}{b(n)} \right] dy $$

$$ = \frac{1}{b(n)} \left[ \int_{x-b(n)}^{x+b(n)} dy - \int_{x-b(n)}^{x+b(n)} \frac{|x-y|}{b(n)}\,dy \right] = 1. $$

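Also, for the same range of $x$,

$$ \mathrm{Var}[f_n(x)] = \mathrm{Var}\left[ \frac{1}{n\,b(n)} \sum_{j=1}^{n} W\!\left[\frac{x - X_j}{b(n)}\right] \right] = \frac{1}{n^2 b(n)^2} \sum_{j=1}^{n} \mathrm{Var}\,W\!\left[\frac{x - X_j}{b(n)}\right] $$

$$ = \frac{1}{n\,b(n)^2}\,\mathrm{Var}\,W\!\left[\frac{x - X_1}{b(n)}\right] = \frac{1}{n\,b(n)^2} \left\{ \int_0^1 W^2\!\left[\frac{x-y}{b(n)}\right] dy - \left[ \int_0^1 W\!\left[\frac{x-y}{b(n)}\right] dy \right]^2 \right\}. $$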
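
For the triangular weight function and $b(n) < x < 1 - b(n)$ the two
integrals in the last expression equal $2b(n)/3$ and $b(n)$, so the
variance reduces to $2/[3\,n\,b(n)] - 1/n$. The following Python sketch
checks the unit mean and this closed form numerically; the midpoint
rule, the grid size and the function names are choices made for this
illustration and are not part of the original study.

```python
def triangular_weight(u):
    """Triangular weight W(u) = 1 - |u| for |u| < 1, and 0 otherwise."""
    a = abs(u)
    return 1.0 - a if a < 1.0 else 0.0


def midpoint_integral(g, lo, hi, m=20000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / m
    return h * sum(g(lo + (i + 0.5) * h) for i in range(m))


def pointwise_moments(x, n, b):
    """Mean and variance of f_n(x) for uniform (0,1) observations."""
    first = midpoint_integral(lambda y: triangular_weight((x - y) / b), 0.0, 1.0)
    second = midpoint_integral(lambda y: triangular_weight((x - y) / b) ** 2, 0.0, 1.0)
    return first / b, (second - first ** 2) / (n * b ** 2)


# Arbitrary interior point and sample size, chosen only for the check.
mean, variance = pointwise_moments(x=0.5, n=100, b=0.1)
closed_form = 2.0 / (3.0 * 100 * 0.1) - 1.0 / 100  # 2/(3 n b) - 1/n
print(mean, variance, closed_form)
```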


Since $f_n(x)$ is a piecewise linear function when a triangular
weight function is used, the integral in (2.1) can be evaluated in
principle but the work becomes prohibitive for even moderate sample
sizes. We thus approximated the integral using Simpson's rule with 100
equal subintervals. The results were found to be satisfactory in the
sense that the value did not change appreciably when a finer grid (up
to 500 subintervals) was used. In general, we found that a larger
sample size required a finer grid; apparently the value of $f_n(x)$
changes more rapidly over a small interval when $n$ is large.
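
A minimal sketch of this calculation, assuming a composite Simpson's
rule and Python's standard pseudo-random generator in place of the
original (unspecified) programs, is given below. The seed, the sample
size and all function names are hypothetical.

```python
import random


def triangular_weight(u):
    """Triangular weight W(u) = 1 - |u| for |u| < 1, and 0 otherwise."""
    a = abs(u)
    return 1.0 - a if a < 1.0 else 0.0


def density_estimate(x, sample, b):
    """Rosenblatt estimate f_n(x) with the triangular weight function."""
    return sum(triangular_weight((x - xj) / b) for xj in sample) / (len(sample) * b)


def simpson(g, lo, hi, m):
    """Composite Simpson's rule with m (even) equal subintervals."""
    if m % 2:
        raise ValueError("m must be even")
    h = (hi - lo) / m
    total = g(lo) + g(hi)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3.0


def beta_uniform(sample, b, m=100):
    """Approximate (2.1): integral over [b, 1 - b] of (f_n(x) - 1)^2."""
    return simpson(lambda x: (density_estimate(x, sample, b) - 1.0) ** 2, b, 1.0 - b, m)


random.seed(1)  # arbitrary seed, for repeatability of the illustration
n = 200
sample = [random.random() for _ in range(n)]
b = 1.0 / n ** 0.5
# Compare a coarse and a fine grid, as was done in the study.
print(beta_uniform(sample, b, 100), beta_uniform(sample, b, 500))
```

Comparing the two printed values mirrors the check described above:
the grid is adequate when refining it leaves the value essentially
unchanged.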

We used three different bandwidths in the uniform case:
$3/n^{1/2}$, $1/n^{1/2}$ and $1/n$. For each bandwidth sample sizes of
100, 200, 500, 1000 and 1500 were investigated so that a total of 15
experiments were carried out. Each experiment consisted of 2000
independent replications each of which resulted in the calculation of
a single value of $\beta(n)$ using (2.1). The replications for a given
experiment were divided into five sections of 400 observations each so
that variability of the simulation results could be assessed between
sections.
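
The sectioning scheme can be sketched as follows. The 2000 values used
here are stand-in gamma deviates with arbitrary parameters, not
simulated values of $\beta(n)$, and the helper names are invented for
the illustration.

```python
import random
import statistics


def split_sections(values, n_sections=5):
    """Split the replications into consecutive sections of equal size."""
    size = len(values) // n_sections
    return [values[i * size:(i + 1) * size] for i in range(n_sections)]


def between_section_spread(values, stat=statistics.mean, n_sections=5):
    """Return the per-section statistic and its standard deviation across sections."""
    per_section = [stat(section) for section in split_sections(values, n_sections)]
    return per_section, statistics.stdev(per_section)


random.seed(0)  # arbitrary seed
# Stand-in replications: 2000 hypothetical values, five sections of 400.
replications = [random.gammavariate(9.0, 0.003) for _ in range(2000)]
section_means, spread = between_section_spread(replications)
print(section_means, spread)
```

Any statistic of interest (mean, variance, a fitted parameter) can be
passed as `stat`, so that its variability between sections is assessed
in the same way.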

Besides the 400 observed values of $\beta(n)$, the computer
output  for  each  of  the  75  sections  included  a  histogram,  an 
empirical  log-survivor  function  plot,  an  empirical  CDF  plot 
and  a  normal  probability  plot.  A  histogram  and  an  empirical 
log-survivor  plot  were  also  computed  for  the  pooled  sample 
of  2000  for  each  experiment.  These  plots  are  all  reproduced 
in  reference  [3];  some  of  the  more  interesting  cases  are 
included  in  Section  4. 

It was found that a better picture of the distribution of the data
resulted when the empirical density function of the $\beta(n)$'s was
plotted over the histogram plot. A fairly wide bandwidth was needed to
suppress large fluctuations in $f_n(x)$; it was found that
$b(n) = R/n^{1/2}$ was a fairly robust choice. ($R$ denotes the sample
range [maximum value - minimum value] of the $\beta(n)$ sample.) The
solid lines in the Figures in Section 4 are empirical density
estimates using this bandwidth and the triangular weight function.
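
A sketch of this bandwidth rule follows. It assumes that $n$ in
$R/n^{1/2}$ is the number of $\beta(n)$ values being smoothed, which
the text does not state explicitly; the function name and the values
are hypothetical.

```python
import math


def range_bandwidth(values):
    """Bandwidth R / sqrt(m): R is the sample range, m the number of values."""
    return (max(values) - min(values)) / math.sqrt(len(values))


# Made-up values standing in for a sample of the statistic.
values = [0.012, 0.020, 0.031, 0.018]
print(range_bandwidth(values))
```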


B.  CAUCHY  RANDOM  VARIABLES

The Cauchy density function is

$$ f(x) = \frac{1}{\pi (1 + x^2)}. $$

We used the same density estimator as in the uniform case:

$$ f_n(x) = \frac{1}{n\,b(n)} \sum_{j=1}^{n} W\!\left[\frac{x - X_j}{b(n)}\right], $$

and again the triangular weight function. We chose a range of
integration (-3, +3):

$$ \beta(n) = \int_{-3}^{+3} [f_n(x) - f(x)]^2\,dx. $$

This range comprises 80% of the probability mass for this
distribution. Again, Simpson's rule was used to approximate the
integral; in this case a grid of 600 subintervals was selected after
examining 100, 300, 600 and 900 subinterval grids.
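
The probability mass quoted for the range (-3, +3) follows from the
Cauchy distribution function and can be checked directly. The sketch
below also shows one common way of generating Cauchy deviates, by
inverting the distribution function; the text does not say how the
deviates were actually generated, so that part is an assumption, as
are the function names.

```python
import math
import random


def cauchy_mass(a):
    """P(-a < X < a) for a standard Cauchy variable: (2 / pi) * atan(a)."""
    return 2.0 * math.atan(a) / math.pi


def cauchy_deviate(rng=random):
    """Inverse-CDF Cauchy deviate tan(pi * (U - 1/2)), U uniform on (0,1)."""
    return math.tan(math.pi * (rng.random() - 0.5))


print(cauchy_mass(3.0))  # close to 0.80
random.seed(2)  # arbitrary seed
print([cauchy_deviate() for _ in range(3)])
```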

The Cauchy distribution was chosen because, for finite $n$, $f_n(x)$
has a bias component; this component usually decreases with bandwidth
for a fixed value of $n$, although the pointwise variance of $f_n(x)$
increases with decreasing bandwidth. It seems likely that the variance
of $\beta(n)$ would also decrease under these conditions, as indeed it
was observed to do.

Three bandwidths were also employed in the Cauchy case:
$1/n^{1/2}$, $3/n^{1/2}$ and $20/n^{1/2}$, the last one representing a
case in which bias in the estimator $f_n(x)$ plays a major role in the
distribution of $\beta(n)$. The same five sample sizes were used here
for each bandwidth as were used for the uniform simulations; output
from the fifteen Cauchy experiments was obtained just as in the
uniform case.


3.  TABULAR  RESULTS  AND  GAMMA  FITS 


Using the asymptotic result obtained by Bickel and Rosenblatt [2],
for a uniform random variable the quantity

$$ b(n)^{-1/2} \left\{ n\,b(n) \int_{b(n)}^{1-b(n)} [f_n(x) - 1]^2\,dx - [1 - 2b(n)] \int W(u)^2\,du \right\} $$

is asymptotically normally distributed with mean 0 and variance

$$ 2\,W^{(4)}(0)\,[1 - 2b(n)] $$

as $n \to \infty$ if $n\,b(n) \to \infty$ and $b(n) = o(n^{-2/9})$. For
the triangular weight function,

$$ \int W(u)^2\,du = \frac{2}{3} $$

and $W^{(4)}(0)$, the fourth convolution of $W$ with itself at zero,
is 302/630.
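
Both constants can be verified exactly. The triangular weight function
is the density of the sum of two independent uniform (-1/2, 1/2)
variables, so its k-fold convolution is the density of a sum of 2k such
variables, and its value at zero is the Irwin-Hall density of 2k
uniform (0,1) variables evaluated at k. The Python sketch below uses
this route, which is ours and not necessarily how the values were
obtained originally; the function name is hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def irwin_hall_density(x, m):
    """Density at x of the sum of m independent uniform (0,1) variables."""
    x = Fraction(x)
    total = Fraction(0)
    k = 0
    while k <= x and k <= m:
        total += (-1) ** k * comb(m, k) * (x - k) ** (m - 1)
        k += 1
    return total / factorial(m - 1)


# Integral of W(u)^2 equals (W * W)(0) for symmetric W: four uniforms, centre 2.
w2_at_zero = irwin_hall_density(2, 4)
# Fourth convolution of W at zero: eight uniforms, centre 4.
w4_at_zero = irwin_hall_density(4, 8)
print(w2_at_zero, w4_at_zero, float(w4_at_zero))
```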

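From the above expressions, we get

$$ E[\beta(n)] = E\left[ \int_{b(n)}^{1-b(n)} [f_n(x) - 1]^2\,dx \right] \approx \frac{2}{3}\,\frac{1 - 2b(n)}{n\,b(n)}, $$

$$ \mathrm{Var}[\beta(n)] = \mathrm{Var}\left[ \int_{b(n)}^{1-b(n)} [f_n(x) - 1]^2\,dx \right] \approx \frac{2\,W^{(4)}(0)\,[1 - 2b(n)]}{n^2\,b(n)}. $$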

Comparisons of the simulated values for the uniform experiments with
the conjectured ones are tabulated in Table III.1 (means) and Table
III.2 (variances). Especially for small bandwidth the agreement
between the asymptotic and simulated variances is very good even for
small n (n = 100). The same is true for expected value, although
convergence is slower than for the variance and again slower for large
bandwidth.
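
The conjectured entries of the two tables follow directly from the
asymptotic expressions for the mean and variance of $\beta(n)$ in the
uniform case. A short Python sketch, with hypothetical function names
and the triangular-weight constants 2/3 and 302/630 inserted, is:

```python
import math

W2 = 2.0 / 3.0              # integral of W(u)^2 for the triangular weight
W4_AT_ZERO = 302.0 / 630.0  # fourth convolution of W at zero


def conjectured_mean(n, b):
    """Asymptotic mean of beta(n), uniform case."""
    return W2 * (1.0 - 2.0 * b) / (n * b)


def conjectured_sd(n, b):
    """Asymptotic standard deviation of beta(n), uniform case."""
    return math.sqrt(2.0 * W4_AT_ZERO * (1.0 - 2.0 * b) / (n * n * b))


for n in (100, 200, 500, 1000, 1500):
    b = 3.0 / math.sqrt(n)
    print(n, round(b, 4), round(conjectured_mean(n, b), 4), round(conjectured_sd(n, b), 4))
```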


TABLE III.1  Comparison of estimated mean values and asymptotic mean
values of $\beta(n)$ for different bandwidths and sample sizes.

                     E(β(n))        E(β(n))/(1-2b(n))
    n      b(n)      Conjectured    Conjectured          Computer output

b(n) = 3/n^{1/2}
   100    .3000      .0089          .0222                .0127
   200    .2121      .0090          .0157                .0409
   500    .1342      .0073          .0099                .0075
  1000    .0949      .0057          .0070                .0058
  1500    .0775      .0048          .0057                .0051

b(n) = 1/n^{1/2}
   100    .1000      .0533          .0667                .0583
   200    .0707      .0405          .0471                .0415
   500    .0447      .0271          .0298                .0269
  1000    .0316      .0197          .0211                .0197
  1500    .0258      .0163          .0172                .0168

TABLE III.2  Comparison of estimated standard deviation values and
asymptotic standard deviation values of $\beta(n)$ for different
bandwidths and sample sizes.

                     σ(β(n))        σ(β(n))/(1-2b(n))
    n      b(n)      Conjectured    Conjectured          Computer output

b(n) = 3/n^{1/2}
   100    .3000      .0113          .0283                .0115
   200    .2121      .0081          .0141                .0088
   500    .1342      .0046          .0063                .0047
  1000    .0949      .0029          .0036                .0030
  1500    .0775      .0022          .0026                .0023

b(n) = 1/n^{1/2}
   100    .1000      .0277          .0346                .0315
   200    .0707      .0171          .0199                .01S9
   500    .0447      .0088          .0097                .0092
  1000    .0316      .0053          .0057                .0056
  1500    .0258      .0040          .0042                .0043



In contrast to the moments, the distribution of $\beta(n)$ converges
very slowly. The complete results (reference [3]) reveal that the
histograms and empirical density functions of the $\beta(n)$'s are all
skewed to the right; see Figures 4.1 to 4.9 for examples.

The form of the histograms as well as the log-survivor plots
suggested that the $\beta(n)$ statistic is approximately
Gamma($\theta$, $k$) distributed, where the Gamma density is given by

$$ f(x; k, \theta) = \frac{(x/\theta)^{k-1}\,e^{-x/\theta}}{\theta\,\Gamma(k)}, $$

and the mean and variance are

$$ E[X] = k\theta; \qquad \mathrm{Var}[X] = k\theta^2. $$

Accordingly, estimates $\hat{k}$ and $\hat{\theta}$ of $k$ and
$\theta$ for each experiment were obtained from the sample of 2000
$\beta(n)$'s. Shenton and Bowman's almost unbiased estimators for the
Gamma distribution [6] were used; these give reasonable results when
$k > 0.5$, as in this case. The estimate values are tabulated in Table
III.3; also tabulated are estimates of the standard deviation of
$\hat{k}$ and $\hat{\theta}$ which were obtained from the five
sections in each experiment. A parametric density estimate is thus
obtained for the $\beta(n)$ sample; it may be compared with the
non-parametric estimate $f_n(x)$ by examining the graphs in Section 4,
where the Gamma density function is plotted with a dashed line.
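
To make the parametrization concrete, the Gamma density and its first
two moments can be written out as below. This is only a sketch of the
density in the form given above; it does not implement the Shenton and
Bowman estimators, whose formulas are not reproduced in this paper.
The function names are hypothetical, and the parameter values are one
row of Table III.3.

```python
import math


def gamma_pdf(x, k, theta):
    """Gamma density (x/theta)^(k-1) exp(-x/theta) / (theta Gamma(k))."""
    if x <= 0.0:
        return 0.0
    log_density = (k - 1.0) * math.log(x / theta) - x / theta - math.lgamma(k)
    return math.exp(log_density) / theta


def gamma_mean_and_variance(k, theta):
    """Mean k*theta and variance k*theta^2 of the Gamma distribution."""
    return k * theta, k * theta ** 2


# Fitted values for the uniform case, b(n) = 1/n^{1/2}, n = 500 (Table III.3).
k_hat, theta_hat = 8.881, 0.00311
mean, variance = gamma_mean_and_variance(k_hat, theta_hat)
print(mean, math.sqrt(variance), gamma_pdf(mean, k_hat, theta_hat))
```

The implied mean and standard deviation can be set beside the
simulated values in Tables III.1 and III.2 for the same experiment.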




TABLE III.3  Estimated Parameters for Fitted Gamma Distribution for
$\beta(n)$.

DISTRIBUTION   b(n)             n      $\hat{k}$              $\hat{\theta}$

UNIFORM        1/n^{1/2}       100      3.969 ± 0.206       0.01390 ± .00095
                               200      5.780 ± 0.659       0.00715 ± .00095
                               500      8.881 ± 0.839       0.00311 ± .00029
                              1000     13.011 ± 0.796       0.00153 ± .00008
                              1500     17.316 ± 1.467       0.00095 ± .00008

UNIFORM        3/n^{1/2}       100      1.153 ± 0.048       0.00967 ± .00058
                               200      1.718 ± 0.17a       0.00588 ± .00078
                               500      2.707 ± 0.241       0.00281 ± 0.00026
                              1000      4.028 ± o.2ai       0.00145 ± .00007
                              1500      5.248 ± 0.423       0.00096 ± .00008

UNIFORM        1/n             100     40.337 ± 2.555       0.01616 ± .00117
                               200     39.511 ± 2.347       0.01675 ± .00106
                               500     33.649 ± 1.820       0.01953 ± .00111
                              1000     32.033 ± 3.305       0.02059 ± .00244
                              1500     31.712 ± 1.999       0.02083 ± .00124

CAUCHY         1/n^{1/2}       100     22.362 ± 1.488       0.01745 ± .00114
                               200     32.305 ± 2.022       0.00864 ± .00054
                               500     60.147 ± 4.009       0.00293 ± .00022
                              1000     79.897 ± 6.608       0.00157 ± .00014
                              1500    101.100 ± 7.783       0.00102 ± .00007

CAUCHY         3/n^{1/2}       100      9.272 ± 0.406       0.01331 ± .00062
                               200     12.744 ± 0.645       0.00709 ± .00037
                               500     20.701 ± 1.673       0.00277 ± .00022
                              1000     29.303 ± 2.541       0.00140 ± .00012
                              1500     34.265 ± 3.963       0.00099 ± .00010

CAUCHY         20/n^{1/2}      100      7.103 ± 0.217       0.00776 ± .00035
                               200      4.14a ± 0.069       0.00619 ± .0000^
                               500      3.aa5 ± 0.161       0.00312 ± .00016
                              1000      4.211 ± 0.357       0.00152 ± .00009
                              1500      5.385 ± 0.335       0.00095 ± .00005

4.  GRAPHICAL  RESULTS  AND  GENERAL  DISCUSSION


The graphs for the following experiments have been reproduced from
[3] because they give the greatest insight into the distribution of
$\beta(n)$; these graphical results are more informative than the
tabulated means, variances and Gamma fits of the previous Section.


Figure   Random Variable      n      b(n)           $\hat{k}$

 4.1     Uniform             200     3/n^{1/2}       1.718
 4.2     Uniform             500     1/n^{1/2}       8.881
 4.3     Uniform            1500     1/n^{1/2}      17.316
 4.4     Uniform             200     1/n            39.511
 4.5     Cauchy              100     1/n^{1/2}      22.362
 4.6     Cauchy              100     3/n^{1/2}       9.272
 4.7     Cauchy             1500     20/n^{1/2}      5.385
 4.8     Uniform            1500     3/n^{1/2}       5.248
 4.9     Uniform             100     1/n^{1/2}       3.969


In interpreting the graphs we can be guided by crude heuristics. In
the case of a density estimate $f_n(x)$ with bandwidth $b(n)$ there is
dependence within a range of order $b(n)$ and an approach to
independence for points separated by a distance of order larger than
$b(n)$. Thus in the case of uniform random variables the integral
$\beta(n)$ could be thought of as having the equivalent of the order
of $[1 - 2b(n)]/b(n)$ independent summands. In the first case (Figure
4.1; $n = 200$, $b(n) = 3/n^{1/2}$, $\hat{k} = 1.718$) we obtain

$$ (1 - 3\sqrt{2}/10) \,/\, [3/(10\sqrt{2})] = 2.71. $$

This is rather small so that one does not expect a good Gaussian fit.
We give $\hat{k}$ from the previous Section since $2\hat{k}$ may be
interpreted as an equivalent number of degrees of freedom; the larger
the fitted $\hat{k}$ the closer we are to normality. In a loose sense
it is clear that a gamma fit is likely to be more appropriate and this
is confirmed by looking at the graphs.

In the second case (Figure 4.2; $n = 500$, $b(n) = 1/n^{1/2}$,
$\hat{k} = 8.881$) we have

$$ (1 - \sqrt{2}/10)\,10\sqrt{2} = 12.14, $$

which is a bit larger. It is interesting to note that the estimated
(smoothed) density function of $\beta(n)$ gives us greater insight
apparently in all cases. Here we see the beginning of an approach to
asymptotic normality though it is still suggested that a Gamma fit
might be appropriate. The next case (Figure 4.3; $n = 1500$,
$b(n) = 1/n^{1/2}$, $\hat{k} = 17.316$) with

$$ [1 - 1/(5\sqrt{15})]\,10\sqrt{15} = 36.73 $$

shows a closer approach to normality. It may be seen that the major
departure between the parametric and non-parametric density estimates
occurs in the vicinity of the mode where $f_n(x)$ tends to fluctuate
about the true value. The fit in the tails appears excellent in all
cases.


The next uniform case (Figure 4.4; $n = 200$, $b(n) = 1/n$,
$\hat{k} = 39.511$) is strictly speaking outside the range of results
suggested by the paper of Bickel and Rosenblatt [2]. Here $f_n(x)$ is
asymptotically compound Poisson rather than asymptotically normal.
Nonetheless we notice that it looks as if a Gaussian fit would be very
good and this is consistent with the magnitude of our crude index

$$ (1 - .01)\,200 = 198. $$

It would be interesting for someone to prove the suggested asymptotic
normality.

In the simulation of sampling from a uniform distribution, the density
estimator has no bias. To investigate the effect of bias, we repeated
the uniform experiments for Cauchy-distributed random variables,
integrated over the range $-3 + b(n)$ to $3 - b(n)$. The first case
(Figure 4.5; $n = 100$, $b(n) = 1/n^{1/2}$, $\hat{k} = 22.362$) has
index

$$ (6 - .2)\,10 = 58 $$

and one notices that a Gaussian fit looks very good. The next case
(Figure 4.6; $n = 100$, $b(n) = 3/n^{1/2}$, $\hat{k} = 9.272$) has
index

$$ (6 - .6)\,10/3 = 17.66, $$

and a Gaussian fit looks fair but not good. In the last Cauchy case
one expects substantial bias (Figure 4.7; $n = 1500$,
$b(n) = 20/n^{1/2}$, $\hat{k} = 5.385$) and the crude index is

$$ (6 - 4/\sqrt{15})\,\sqrt{15}/2 = 9.62. $$

A Gamma fit is suggested. Altogether the effects of bias do not seem
to be that extreme when sampling from the Cauchy distribution but this
may be due to the fact that the Cauchy density is a very smooth
function.
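
The crude index used in this discussion is simply the trimmed length
of the range of integration divided by the bandwidth. A small Python
sketch, with a hypothetical function name, reproduces several of the
values quoted above:

```python
import math


def crude_index(length, b):
    """(length - 2 b) / b: rough number of independent summands in beta(n)."""
    return (length - 2.0 * b) / b


# Range length 1 for the uniform cases, 6 for the Cauchy cases.
print(round(crude_index(1.0, 3.0 / math.sqrt(200)), 2))    # Figure 4.1
print(round(crude_index(1.0, 1.0 / math.sqrt(1500)), 2))   # Figure 4.3
print(round(crude_index(1.0, 1.0 / 200), 2))               # Figure 4.4
print(round(crude_index(6.0, 1.0 / math.sqrt(100)), 2))    # Figure 4.5
print(round(crude_index(6.0, 20.0 / math.sqrt(1500)), 2))  # Figure 4.7
```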

The last two cases involve sampling from the uniform distribution
again but with different sample sizes and bandwidths. Figure 4.8 is
for $n = 1500$, $b(n) = 3/n^{1/2}$ and $\hat{k} = 5.248$, while Figure
4.9 is for $n = 100$ and $b(n) = 1/n^{1/2}$ for which
$\hat{k} = 3.969$.




The problem in using $\beta(n)$ as a measure of goodness of fit in the
non-limiting Gamma case is to determine $k$ and $\theta$. If one wishes
to fit the Gamma distribution using the method of moments, one can use
the fact that the mean and variance of $\beta(n)$ should be
approximately (on asymptotic grounds) $W^{(2)}(0)$ and
$b(n)\,W^{(4)}(0)$, respectively. One might then use

$$ \hat{k} = \frac{[W^{(2)}(0)]^2}{b(n)\,W^{(4)}(0)}, \qquad \hat{\theta} = \frac{W^{(2)}(0)}{\hat{k}} $$

as estimates of $k$ and $\theta$. The results in Section 3 suggest
that this procedure should produce adequate results except when there
is appreciable bias in the density function estimate.
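
As a hedged illustration of a moment fit, the sketch below matches the
Gamma mean $k\theta$ and variance $k\theta^2$ to a given mean and
variance. For the moments it plugs in the uniform-case expressions of
Section 3 for $\beta(n)$ itself, which carry the factors
$1/[n\,b(n)]$, $1 - 2b(n)$ and 2 explicitly; that choice, and the
function names, are assumptions made for the example.

```python
import math

W2 = 2.0 / 3.0              # integral of W(u)^2 for the triangular weight
W4_AT_ZERO = 302.0 / 630.0  # fourth convolution of W at zero


def moment_gamma_fit(mean, variance):
    """Method of moments: k = mean^2 / variance, theta = variance / mean."""
    return mean * mean / variance, variance / mean


def uniform_asymptotic_moments(n, b):
    """Asymptotic mean and variance of beta(n) in the uniform case (Section 3)."""
    mean = W2 * (1.0 - 2.0 * b) / (n * b)
    variance = 2.0 * W4_AT_ZERO * (1.0 - 2.0 * b) / (n * n * b)
    return mean, variance


n = 500
b = 1.0 / math.sqrt(n)
k_fit, theta_fit = moment_gamma_fit(*uniform_asymptotic_moments(n, b))
print(k_fit, theta_fit)
```

The printed values may be compared with the fitted $\hat{k}$ and
$\hat{\theta}$ in Table III.3 for the uniform case with
$b(n) = 1/n^{1/2}$ and $n = 500$.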
Figure 4.1.  Distribution of the statistic $\beta(n)$ for a uniform
random variable with $n = 200$ and bandwidth $3/n^{1/2}$. The solid
line shows the Rosenblatt empirical density function of the
$\beta(n)$'s while the dashed line is a fitted Gamma density function
with $\hat{k} = 1.718$ and $\hat{\theta} = .00588$.

Figure 4.2.  Distribution of the statistic $\beta(n)$ for a uniform
random variable with $n = 500$ and bandwidth $1/n^{1/2}$. The solid
line shows the Rosenblatt empirical density function of the
$\beta(n)$'s while the dashed line is a fitted Gamma density function
with $\hat{k} = 8.881$ and $\hat{\theta} = .00311$.

Figure 4.3.  Distribution of the statistic $\beta(n)$ for a uniform
random variable with $n = 1500$ and bandwidth $1/n^{1/2}$. The solid
line shows the Rosenblatt empirical density function of the
$\beta(n)$'s while the dashed line is a fitted Gamma density function
with $\hat{k} = 17.316$ and $\hat{\theta} = .00095$.

Figure 4.4.  Distribution of the statistic $\beta(n)$ for a uniform
random variable with $n = 200$ and bandwidth $1/n$. The solid line
shows the Rosenblatt empirical density function of the $\beta(n)$'s
while the dashed line is a fitted Gamma density function with
$\hat{k} = 39.511$ and $\hat{\theta} = .01675$.

Figure 4.5.  Distribution of the statistic $\beta(n)$ for a Cauchy
random variable with $n = 100$ and bandwidth $1/n^{1/2}$. The solid
line shows the Rosenblatt empirical density function of the
$\beta(n)$'s while the dashed line is a fitted Gamma density function
with $\hat{k} = 22.362$ and $\hat{\theta} = .01745$.

Figure 4.6.  Distribution of the statistic $\beta(n)$ for a Cauchy
random variable with $n = 100$ and bandwidth $3/n^{1/2}$. The solid
line shows the Rosenblatt empirical density function of the
$\beta(n)$'s while the dashed line is a fitted Gamma density function
with $\hat{k} = 9.272$ and $\hat{\theta} = .01331$.

Figure 4.7.  Distribution of the statistic $\beta(n)$ for a Cauchy
random variable with $n = 1500$ and bandwidth $20/n^{1/2}$. The solid
line shows the Rosenblatt empirical density function of the
$\beta(n)$'s while the dashed line is a fitted Gamma density function
with $\hat{k} = 5.385$ and $\hat{\theta} = .00095$.

Figure 4.8.  Distribution of the statistic $\beta(n)$ for a uniform
random variable with $n = 1500$ and bandwidth $3/n^{1/2}$. The solid
line shows the Rosenblatt empirical density function of the
$\beta(n)$'s while the dashed line is a fitted Gamma density function
with $\hat{k} = 5.248$ and $\hat{\theta} = .00096$.

Figure 4.9.  Distribution of the statistic $\beta(n)$ for a uniform
random variable with $n = 100$ and bandwidth $1/n^{1/2}$. The solid
line shows the Rosenblatt empirical density function of the
$\beta(n)$'s while the dashed line is a fitted Gamma density function
with $\hat{k} = 3.969$ and $\hat{\theta} = .01390$.




REFERENCES 

[1]  Bartlett, M.S., (1963). Statistical estimation of density
functions. Sankhya, Ser. A25, p. 245-254.

[2]  Bickel, P.J. and Rosenblatt, M., (1973). On some global measures
of the deviations of density function estimates. The Annals of
Statistics, v. 1, p. 1071-1095.

[3]  Liu, L.H., (1974). Empirical sampling investigation of a global
measure of fit of probability density functions. M.S. Thesis, Naval
Postgraduate School, Monterey.

[4]  Rosenblatt, M., (1956). Remarks on some non-parametric estimates
of a density function. The Annals of Mathematical Statistics, v. 27.

[5]  Rosenblatt, M., (1971). Curve estimates. The Annals of
Mathematical Statistics, v. 42.

[6]  Shenton, L.R., and Bowman, K.O., (1973). Comments on the Gamma
distribution and uses in rainfall data. Third Conference on
Probability and Statistics in Atmospheric Science, AMS.

[7]  Wegman, E.J., (1972). Non-parametric probability density
estimation: I. A summary of available methods. Technometrics, v. 14.




DISTRIBUTION  LIST 

Copies 


Dean  of  Research 

Code  023 

Naval  Postgraduate  School 

Monterey,  California    93940 

Defense  Documentation  Center  2 

Cameron  Station 

Alexandria,  Virginia    22314 

Library  (Code  0212)  2 

Naval  Postgraduate  School 
Monterey,  California   93940 

Library  (Code  55)  2 

Naval  Postgraduate  School 
Monterey,  California    93940 

Professor  P.  A.  W.  Lewis  125 

Department  of  Operations  Research 

and  Administrative  Sciences 
Naval  Postgraduate  School 
Monterey,  California   93940




